We’re primarily concerned with the relationship between speaking rate and the amplitude (BA) of consonant bursts in onset CVs. We’re also interested in whether there are differences in how BA is reflected in languages that differ in how phonological voicing is implemented: English (long-lag VOT) and Indian Tamil (short-lag VOT).
Participants (English n=12; Tamil n=10) self-recorded CVs (at each place of articulation with /a/) at 4 different, self-regulated speeds (slow, normal, fast, very fast).
We looked at a few amplitude landmarks in the burst and following vowel:
Burst
Vowel
Difference measures served as dependent variables: 1. HiDiff–Av-Ahi 2. F23Diff–Av-Max23
And the temporal measurements: 1. v_dur: Vowel duration of the CV 2. VOT: time from onset of the burst to zero crossing of first periodic wave of the vowel
The distribution of v_dur was examined and log-transformed to ensure a more normal distribution –> but do I need it?
The density plots:
An the log(v_dur) density plot:
The predictions laid out in the introduction rely on there being a clear relationship between speaking rate and vowel duration. That is, it need not be the case that the more syllables a speaker speaks in a given time affect the duration of the syllabic nucleus. Increases in rate may simply result from a decrease in pauses between syllables. Given this complication we might appeal to the term “articulation rate”(Jacewicz, et al. 2009[LVC]), which captures the rate of speech production not including pauses between syllables. The impressionistic self-assessment of articulation rate as “slow,” “normal,” “fast,” and “very fast,” leads to a highly variable productions between speakers. In order to avoid bleaching the variation in articulation rate production (which inevitably occurs in designs that aggregate tokens according to impressionistic rate) a continuous measure such as vowel duration (in CVs) is desirable. While it need not be the case that vowels in CVs decrease in duration with increased rate, it remains an empirical question in the present data whether such a relationship is borne out.
Vowel duration data, which was log transformed to reflect a normal distribution, was modeled as a function of Rate and Language. Getting data and fixing some factors:
The model:
## Linear mixed-effects model fit by REML
## Data: rate
## AIC BIC logLik
## -14009.39 -13890.88 7023.697
##
## Random effects:
## Formula: ~1 + rate | sub
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 0.06343665 (Intr) rtnrml ratfst
## ratenormal 0.04699030 -0.670
## ratefast 0.06875148 -0.762 0.823
## ratevery fast 0.06861315 -0.850 0.624 0.867
## Residual 0.03612480
##
## Fixed effects: v_dur ~ rate * lang
## Value Std.Error DF t-value p-value
## (Intercept) 0.31368955 0.01841450 3762 17.034922 0.0000
## ratenormal -0.11262895 0.01396285 3762 -8.066329 0.0000
## ratefast -0.17885315 0.01999434 3762 -8.945191 0.0000
## ratevery fast -0.22730680 0.01995739 3762 -11.389606 0.0000
## langTamil -0.03388071 0.02814385 19 -1.203841 0.2434
## ratenormal:langTamil 0.08765063 0.02121950 3762 4.130665 0.0000
## ratefast:langTamil 0.08573001 0.03056328 3762 2.805000 0.0051
## ratevery fast:langTamil 0.06977217 0.03050288 3762 2.287396 0.0222
## Correlation:
## (Intr) rtnrml ratfst rtvryf lngTml rtnr:T rtfs:T
## ratenormal -0.662
## ratefast -0.763 0.807
## ratevery fast -0.849 0.615 0.863
## langTamil -0.654 0.433 0.499 0.555
## ratenormal:langTamil 0.435 -0.658 -0.531 -0.405 -0.666
## ratefast:langTamil 0.499 -0.528 -0.654 -0.565 -0.763 0.812
## ratevery fast:langTamil 0.555 -0.403 -0.565 -0.654 -0.849 0.619 0.863
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -9.50656601 -0.52674499 -0.05010961 0.48470575 5.81683675
##
## Number of Observations: 3789
## Number of Groups: 21
## NOTE: Results may be misleading due to involvement in interactions
## contrast estimate SE df t.ratio p.value
## slow - normal 0.0688 0.01061 3762 6.485 <.0001
## slow - fast 0.1360 0.01528 3762 8.899 <.0001
## slow - very fast 0.1924 0.01525 3762 12.617 <.0001
## normal - fast 0.0672 0.00910 3762 7.386 <.0001
## normal - very fast 0.1236 0.01203 3762 10.275 <.0001
## fast - very fast 0.0564 0.00798 3762 7.068 <.0001
##
## Results are averaged over the levels of: lang
## Degrees-of-freedom method: containment
## P value adjustment: tukey method for comparing a family of 4 estimates
## contrast effect.size SE df lower.CL upper.CL
## slow - normal 1.90 0.294 19 1.29 2.52
## slow - fast 3.76 0.423 19 2.88 4.65
## slow - very fast 5.33 0.422 19 4.44 6.21
## normal - fast 1.86 0.252 19 1.33 2.39
## normal - very fast 3.42 0.333 19 2.72 4.12
## fast - very fast 1.56 0.221 19 1.10 2.02
##
## Results are averaged over the levels of: lang
## sigma used for effect sizes: 0.03612
## Degrees-of-freedom method: inherited from containment when re-gridding
## Confidence level used: 0.95
The plot:
Here is the real data:
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ tibble 3.1.6 ✓ dplyr 1.0.7
## ✓ tidyr 1.1.4 ✓ stringr 1.4.0
## ✓ readr 1.4.0 ✓ forcats 0.5.1
## ✓ purrr 0.3.4
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::collapse() masks nlme::collapse()
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
##
## The downloaded binary packages are in
## /var/folders/nv/72l7lgjs2s5d2y6bbl5wz6w40000gn/T//RtmpFqHLox/downloaded_packages
## Warning: Ignoring unknown parameters: notch, outlier.shape
The data were coded separately by language. We want to make a full model, so need to combine the dataframes.First clean up the Tamil data:
Now do something similar for English:
Make a new full data frame:
First we need to install “sjPlot” which will give linear model prediction plots with confidence intervals
Make a “lang” column in the full data frame. Make new column (log_vdur) because we’re taking log of duration measures (as suggested by a reviewer).
## Linear mixed-effects model fit by REML
## Data: full
## AIC BIC logLik
## 25671.91 25721.2 -12827.95
##
## Random effects:
## Formula: ~1 + logv_dur | sub
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 8.420798 (Intr)
## logv_dur 5.712429 0.689
## Residual 9.195737
##
## Fixed effects: F23Diff ~ logv_dur * lang
## Value Std.Error DF t-value p-value
## (Intercept) 15.463022 2.563361 3484 6.032323 0.0000
## logv_dur -5.983280 1.702555 3484 -3.514295 0.0004
## langTamil 5.518438 3.909974 20 1.411375 0.1735
## logv_dur:langTamil -0.110112 2.606735 3484 -0.042241 0.9663
## Correlation:
## (Intr) lgv_dr lngTml
## logv_dur 0.708
## langTamil -0.656 -0.464
## logv_dur:langTamil -0.462 -0.653 0.725
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -3.13179308 -0.67893680 0.01514047 0.60285624 4.43025576
##
## Number of Observations: 3508
## Number of Groups: 22
And the F23Diff plot for the full data set:
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
## Warning: Removed 2 row(s) containing missing values (geom_path).
## Warning: Removed 1 rows containing missing values (geom_point).
The plot confirms the model results: There is no significant language effect, though Tamil is overall 4.5dB higher (more amplitude difference between BA and V; quieter) in the F2/F3 range than English. The effect of logv_dur is evident (negative, decreasing by 6dB per unit of logv_dur). There is no interaction.
## Linear mixed-effects model fit by REML
## Data: full
## AIC BIC logLik
## 25974.42 26023.71 -12979.21
##
## Random effects:
## Formula: ~1 + logv_dur | sub
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 7.803985 (Intr)
## logv_dur 4.236696 0.54
## Residual 9.616089
##
## Fixed effects: HiDiff ~ logv_dur * lang
## Value Std.Error DF t-value p-value
## (Intercept) 22.614814 2.407110 3484 9.395008 0.0000
## logv_dur -4.870581 1.299811 3484 -3.747146 0.0002
## langTamil 10.666652 3.668565 20 2.907581 0.0087
## logv_dur:langTamil 1.095494 2.019957 3484 0.542335 0.5876
## Correlation:
## (Intr) lgv_dr lngTml
## logv_dur 0.589
## langTamil -0.656 -0.386
## logv_dur:langTamil -0.379 -0.643 0.617
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -3.4201684 -0.6773617 -0.0484311 0.6153636 4.3434950
##
## Number of Observations: 3508
## Number of Groups: 22
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
## Warning: Removed 2 row(s) containing missing values (geom_path).
## Warning: Removed 2 rows containing missing values (geom_point).
The plot confirms that HiDiff is higher (lower BA) in Tamil (11.69dB) than English. The effect of logv_dur on HiDiff is the same in both languages (-4.83dB/unit logv_dur).
Now let’s make a new FULL data frame with just p-s and k-s (both are in English and Tamil) [Not executed here]
full_pk <- full[which(full$POA=="1" | full$POA=="3"),]
We’re going to model the difference between p and k as a function of vowel duration in the two languages
pk_model <- lme(HiDiff ~ POA * logv_dur * lang, full_pk, random = ~1 + logv_dur | sub)
summary(pk_model)
plot_model(pk_model, type="pred", terms = c("logv_dur","POA", "lang"))
Next we model F23Diff and HiDiff in each language to examine the effects of place of articulation (POA) as well as speaking rate (logv_dur).
## Linear mixed-effects model fit by REML
## Data: eng
## AIC BIC logLik
## 13267.36 13323.59 -6623.681
##
## Random effects:
## Formula: ~1 + logv_dur | sub
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 5.721220 (Intr)
## logv_dur 1.752705 0.604
## Residual 6.027890
##
## Fixed effects: F23Diff ~ POA * logv_dur
## Value Std.Error DF t-value p-value
## (Intercept) 18.049802 1.8327098 2033 9.848696 0
## POAt 4.789144 1.0692112 2033 4.479138 0
## POAk -6.721391 1.0967030 2033 -6.128725 0
## logv_dur -7.392485 0.6350577 2033 -11.640650 0
## POAt:logv_dur 5.383440 0.5023528 2033 10.716453 0
## POAk:logv_dur 3.063838 0.5292525 2033 5.788990 0
## Correlation:
## (Intr) POAt POAk lgv_dr POAt:_
## POAt -0.278
## POAk -0.275 0.475
## logv_dur 0.684 -0.359 -0.357
## POAt:logv_dur -0.260 0.953 0.449 -0.367
## POAk:logv_dur -0.251 0.437 0.953 -0.355 0.453
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -4.189331196 -0.660845533 -0.008732887 0.647414422 4.188868778
##
## Number of Observations: 2050
## Number of Groups: 12
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
## Warning: Removed 6 row(s) containing missing values (geom_path).
## Warning: Removed 2 rows containing missing values (geom_text).
## Linear mixed-effects model fit by REML
## Data: eng
## AIC BIC logLik
## 13339.8 13396.02 -6659.898
##
## Random effects:
## Formula: ~1 + logv_dur | sub
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 7.837531 (Intr)
## logv_dur 3.006828 0.859
## Residual 6.125587
##
## Fixed effects: HiDiff ~ POA * logv_dur
## Value Std.Error DF t-value p-value
## (Intercept) 21.977997 2.4035900 2033 9.143821 0.0000
## POAt 3.319602 1.0876172 2033 3.052178 0.0023
## POAk 3.234407 1.1158782 2033 2.898531 0.0038
## logv_dur -8.306631 0.9528595 2033 -8.717581 0.0000
## POAt:logv_dur 8.846597 0.5108015 2033 17.319049 0.0000
## POAk:logv_dur 4.990986 0.5384835 2033 9.268595 0.0000
## Correlation:
## (Intr) POAt POAk lgv_dr POAt:_
## POAt -0.216
## POAk -0.214 0.475
## logv_dur 0.870 -0.244 -0.243
## POAt:logv_dur -0.202 0.953 0.449 -0.249
## POAk:logv_dur -0.195 0.437 0.953 -0.242 0.453
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -3.625054357 -0.645872405 0.001457556 0.606850629 3.705172369
##
## Number of Observations: 2050
## Number of Groups: 12
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
## Warning: Removed 6 row(s) containing missing values (geom_path).
## Warning: Removed 2 rows containing missing values (geom_text).
## Linear mixed-effects model fit by REML
## Data: tam
## AIC BIC logLik
## 10018.72 10082.07 -4997.358
##
## Random effects:
## Formula: ~1 + logv_dur | sub
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 10.781643 (Intr)
## logv_dur 5.602153 0.69
## Residual 7.314837
##
## Fixed effects: F23Diff ~ POA * logv_dur
## Value Std.Error DF t-value p-value
## (Intercept) 35.60760 3.775104 1441 9.432218 0.0000
## POAt̪ -11.17536 2.015946 1441 -5.543482 0.0000
## POAʈ -16.33032 2.091883 1441 -7.806517 0.0000
## POAk -18.58287 2.032474 1441 -9.142981 0.0000
## logv_dur -3.95131 2.003884 1441 -1.971826 0.0488
## POAt̪:logv_dur -1.99926 1.118121 1441 -1.788054 0.0740
## POAʈ:logv_dur -0.29085 1.179988 1441 -0.246485 0.8053
## POAk:logv_dur 1.07846 1.137651 1441 0.947970 0.3433
## Correlation:
## Warning in abbreviate(colnames(x), minlength = rdig + 3): abbreviate used with
## non-ASCII chars
## (Intr) POAt̪ POAʈ POAk lgv_dr POA̪:_ POAʈ:_
## POAt̪ -0.254
## POAʈ -0.255 0.465
## POAk -0.266 0.483 0.472
## logv_dur 0.744 -0.251 -0.252 -0.265
## POAt̪:logv_dur -0.238 0.963 0.439 0.455 -0.254
## POAʈ:logv_dur -0.238 0.432 0.965 0.437 -0.255 0.442
## POAk:logv_dur -0.245 0.452 0.442 0.962 -0.262 0.463 0.444
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -3.34805541 -0.61373420 -0.03676703 0.59083622 4.18121390
##
## Number of Observations: 1458
## Number of Groups: 10
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
First make a model that doesn’t have logv_dur as an interaction factor but as an additive factor.
## contrast estimate SE df t.ratio p.value
## p - t̪ 7.74 0.547 1444 14.152 <.0001
## p - ʈ 15.82 0.551 1444 28.708 <.0001
## p - k 20.38 0.557 1444 36.585 <.0001
## t̪ - ʈ 8.09 0.545 1444 14.835 <.0001
## t̪ - k 12.64 0.548 1444 23.045 <.0001
## ʈ - k 4.55 0.548 1444 8.307 <.0001
##
## Degrees-of-freedom method: containment
## P value adjustment: tukey method for comparing a family of 4 estimates
## contrast effect.size SE df lower.CL upper.CL
## p - t̪ 1.056 0.0746 9 0.887 1.225
## p - ʈ 2.160 0.0752 9 1.990 2.330
## p - k 2.782 0.0760 9 2.610 2.954
## t̪ - ʈ 1.104 0.0744 9 0.936 1.272
## t̪ - k 1.725 0.0749 9 1.556 1.895
## ʈ - k 0.621 0.0748 9 0.452 0.791
##
## sigma used for effect sizes: 7.325
## Degrees-of-freedom method: inherited from containment when re-gridding
## Confidence level used: 0.95
## Linear mixed-effects model fit by REML
## Data: tam
## AIC BIC logLik
## 10374.18 10437.54 -5175.092
##
## Random effects:
## Formula: ~1 + logv_dur | sub
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 13.801240 (Intr)
## logv_dur 6.800062 0.784
## Residual 8.267072
##
## Fixed effects: HiDiff ~ POA * logv_dur
## Value Std.Error DF t-value p-value
## (Intercept) 49.02991 4.737483 1441 10.349359 0.0000
## POAk -11.32368 2.278459 1441 -4.969888 0.0000
## POAʈ -22.02979 2.364418 1441 -9.317212 0.0000
## POAt̪ -15.11697 2.297717 1441 -6.579125 0.0000
## logv_dur -0.98952 2.400413 1441 -0.412229 0.6802
## POAk:logv_dur 0.79510 1.263700 1441 0.629186 0.5293
## POAʈ:logv_dur -2.98440 1.333731 1441 -2.237634 0.0254
## POAt̪:logv_dur -0.29805 1.285957 1441 -0.231774 0.8167
## Correlation:
## Warning in abbreviate(colnames(x), minlength = rdig + 3): abbreviate used with
## non-ASCII chars
## (Intr) POAk POAʈ POAt̪ lgv_dr POAk:_ POAʈ:_
## POAk -0.229
## POAʈ -0.230 0.465
## POAt̪ -0.240 0.483 0.471
## logv_dur 0.814 -0.237 -0.238 -0.250
## POAk:logv_dur -0.214 0.963 0.439 0.455 -0.240
## POAʈ:logv_dur -0.215 0.432 0.965 0.437 -0.241 0.442
## POAt̪:logv_dur -0.221 0.452 0.442 0.962 -0.248 0.463 0.444
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -3.58958600 -0.65006391 -0.06205775 0.64230112 3.78102861
##
## Number of Observations: 1458
## Number of Groups: 10
First make a model that doesn’t have logv_dur as an interaction factor but as an additive factor.
## Linear mixed-effects model fit by REML
## Data: tam
## AIC BIC logLik
## 10383.01 10430.54 -5182.504
##
## Random effects:
## Formula: ~1 + logv_dur | sub
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 13.677256 (Intr)
## logv_dur 6.671304 0.781
## Residual 8.283415
##
## Fixed effects: HiDiff ~ POA + logv_dur
## Value Std.Error DF t-value p-value
## (Intercept) 47.97902 4.520417 1444 10.613848 0.0000
## POAk -12.65826 0.618307 1444 -20.472469 0.0000
## POAʈ -16.99306 0.623348 1444 -27.260979 0.0000
## POAt̪ -14.57239 0.630006 1444 -23.130556 0.0000
## logv_dur -1.59537 2.246891 1444 -0.710034 0.4778
## Correlation:
## (Intr) POAk POAʈ POAt̪
## POAk -0.082
## POAʈ -0.081 0.507
## POAt̪ -0.098 0.506 0.511
## logv_dur 0.795 -0.016 -0.009 -0.036
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -3.57433144 -0.64924701 -0.06583899 0.63756926 3.82317152
##
## Number of Observations: 1458
## Number of Groups: 10
Then the pairwise comparisons:
## contrast estimate SE df t.ratio p.value
## p - k 12.66 0.618 1444 20.472 <.0001
## p - ʈ 16.99 0.623 1444 27.261 <.0001
## p - t̪ 14.57 0.630 1444 23.131 <.0001
## k - ʈ 4.33 0.616 1444 7.032 <.0001
## k - t̪ 1.91 0.620 1444 3.086 0.0111
## ʈ - t̪ -2.42 0.620 1444 -3.906 0.0006
##
## Degrees-of-freedom method: containment
## P value adjustment: tukey method for comparing a family of 4 estimates
## contrast effect.size SE df lower.CL upper.CL
## p - k 1.528 0.0746 9 1.3593 1.697
## p - ʈ 2.051 0.0753 9 1.8812 2.222
## p - t̪ 1.759 0.0761 9 1.5872 1.931
## k - ʈ 0.523 0.0744 9 0.3550 0.692
## k - t̪ 0.231 0.0749 9 0.0617 0.400
## ʈ - t̪ -0.292 0.0748 9 -0.4615 -0.123
##
## sigma used for effect sizes: 8.283
## Degrees-of-freedom method: inherited from containment when re-gridding
## Confidence level used: 0.95